Time-frequency detection algorithm for gravitational wave bursts 
An efficient algorithm is presented for the identification of short bursts of gravitational radiation 
in the data from broad-band interferometric detectors. The algorithm consists of three steps: pixels 
of the time-frequency representation of the data that have power above a fixed threshold are first 
identified. Clusters of such pixels that conform to a set of rules on their size and their proximity 
to other clusters are formed, and a final threshold is applied on the power integrated over all pixels 
in such clusters. Formal arguments are given to support the conjecture that this algorithm is very 
efficient for a wide class of signals. A precise model for the false alarm rate of this algorithm is 
presented, and it is shown using a number of representative numerical simulations to be accurate at 
the 1% level for most values of the parameters, with maximal error around 10%. 

PACS numbers: 04.80.Nn, 07.05.Kf, 95.55.Ym 



I. INTRODUCTION 



A number of large laser interferometers [?] are approaching sensitivities to gravitational waves in the ~10-1000 Hz
frequency band that could be sufficient for the detection of astrophysical events [?]. The signals from these events will
be buried deep in the instrumental noise, so that unambiguous detections will be possible only with highly efficient
data processing algorithms.

The focus of this article will be on transient sources of gravitational radiation, which will be defined as sources that 
have a relatively short duration (milliseconds to tens of seconds) and a bandwidth which overlaps at least partially 
with that of the interferometric detectors. A significant number of such transient sources have been theorized, with 
various levels of sophistication. For instance, the inspiral portion of coalescing compact binaries is well understood
(by post-Newtonian expansion techniques [?]), as is the ring-down portion if a black hole results from the coalescence
[?], but the merger portion is understood at best only qualitatively [?]. As the mass of the binary increases, the
signal-to-noise ratio of the merger portion dominates that of the inspiral and ringdown portions; the coalescence of
10 M⊙ to 1000 M⊙ black hole binaries could be visible at large distances, provided that the merger waveform could be
detected with sufficient efficiency [?], [?]. The collapse of the core of massive stars could also produce detectable signals
[?]; depending on the type of progenitor, bar modes, r-modes, fragmentation instabilities or black hole ringdowns could
be important sources of gravitational radiation. The details of the waveforms for most of these different mechanisms
are far from known; even in the best cases, only numerical simulations covering parts of the relevant physics are
available in the literature [?]. Hence, as can be seen from the preceding examples, the amount of information about
the gravitational wave signal from various sources varies considerably, and this variability is reflected in the
efficiency of the algorithms that can be designed for each class of sources.

Only minimal assumptions about the signal will be made in this paper, and therefore the principal characteristic of 
the algorithm to be presented will be its robustness against poor modeling of the expected signal waveform. Stated 
differently, this algorithm will be moderately sensitive to a very large class of signals, as opposed to being very
sensitive to only a few specific signals. It will correspondingly be useful to search for transient sources that do not 
have waveforms that are precisely predicted, and to characterize the non-Gaussian, transient component of the noise 
in the instruments. 

It is important when designing a detection algorithm to compare its efficacy to that of other designs; the development 
process can stop when an algorithm that outperforms all others is discovered. Various measures of this efficacy can be 
adopted, and various techniques can be used to obtain the optimal algorithm, as discussed in section II. It is argued
that the algorithm that is presented in this article, the TFCLUSTERS algorithm, has a structure which is close to that 
of the optimal detector for a large variety of signal classes. 



The TFCLUSTERS algorithm is explained in detail in section III. To summarize, it consists of the following four steps:

(i) The data y from a gravitational wave detector are transformed into a time-frequency representation with fixed
time and frequency resolutions T and F, respectively; the instantaneous power at time t = iT and frequency
f = jF, estimated from this representation, is labeled P_ij(y).

(ii) A threshold η is applied on the power, in order to retain only pixels with P_ij(y) > η.

(iii) Clusters of pixels with power above threshold are formed by grouping pixels sharing a common side; clusters
that do not conform to a fixed set of rules on their size or distance to other clusters are discarded.

(iv) A threshold on the sum of P_ij(y) − η over the surviving clusters is applied. Data segments containing clusters
satisfying this threshold lead to the acceptance of the hypothesis that they contain a signal.

In order to understand the operating characteristics of this algorithm, a simplified version [without the clustering
analysis, i.e. consisting only of steps (i), (ii) and (iv)] is shown in section IV to maximize, for all signals in R^N, the
signal-to-noise ratio among all detectors that are based on the estimation of a lower bound of the signal power. This
simplified algorithm is especially efficient for signals with a sparse representation in the time-frequency domain. Since
most signals are expected to have pixels that present a high degree of spatial correlation in the time-frequency domain,
the clustering analysis of TFCLUSTERS is an interesting way to capitalize on that property to filter out a large portion
of the noise, as shown in section V.

An analytical method for computing the false alarm rate associated with TFCLUSTERS is developed, with the details
for the clustering analysis being presented in section V A. Using the computer-generated enumeration of all the
possible clusters of a certain size that can be formed, it is shown that large clusters are exponentially unlikely to occur
when only noise is present. The rate of occurrence of pairs of clusters separated by a certain distance is computed in an
analogous manner. One example of a complete analysis of the performance of TFCLUSTERS for a short, narrow-band
signal, including the optimization of its efficiency at fixed false alarm rate, is presented in section V B. The efficiency
of TFCLUSTERS is compared to that of the (unrealistic) ideal power detector (which assumes a knowledge of the signal
duration and central frequency) as a function of the signal-to-noise ratio of the signal; at fixed probability of detection,
the reduction in signal-to-noise ratio for TFCLUSTERS is consistently less than about 30%.



Numerical simulations (section VI) confirm that the analytical model for the false alarm rate is accurate in the regime 
of operation relevant to TFCLUSTERS, with errors on the predicted false alarm rates around 1% in most situations, 
and with maximum errors around 10%, due to the exclusion of higher-order terms in the theoretical modeling, or to 
unmodeled finite size effects. 



II. TRANSIENT DETECTION 



The transient detection problem consists in choosing between two possibilities: the observed data y(t), 0 < t < T,
consist of a signal term s(t) and a noise term n(t), where the noise is additive and is assumed to be white Gaussian
with zero mean and unit variance [10]:

y(t) = s(t) + ε n(t), (1)

or the data are noise alone:

y(t) = ε n(t). (2)

In the simplest case, s(t) is fixed and known. Perhaps the most natural optimality criterion is then to maximize
the probability of detection when the signal is present, subject to the probability of detection when no signal is
present (the false alarm probability) being smaller than some preassigned value. This so-called Neyman-Pearson criterion leads to the use of
the amplitude Q of the signal, optimally estimated [11] from the scalar product

Q = (y(t), s(t)), (3)

as the statistic on which the decision between Eq. (1) and Eq. (2) is made.

In the case where s(t) ∈ W for some function space W, one can integrate over the signal distribution to reduce the
problem to the simple case of discerning between two fixed probability distributions, as is done when s(t) is fixed and
known. This requires the knowledge of the prior distribution p[s(t)], and results in using the likelihood ratio

\Lambda[y(t)] = \frac{p[y(t)]}{p[y(t) \mid 0]}, (4)

where p[y(t)] denotes the distribution of the data after integration over the prior,
as the detection statistic. When the prior p[s(t)] is unknown, the choice of an optimality criterion is not as simple as for
the fixed signal case; the modified Neyman-Pearson criterion maximizes the minimum of the probability of detection
over all s(t) ∈ W, for the false alarm probability being below some preassigned value, and is therefore an interesting
"conservative" choice. It is well-known that the optimal algorithm in terms of the modified Neyman-Pearson criterion
is obtained by choosing the prior p[s(t)] that is least-favorable, i.e. the prior that minimizes the minimum over W
of the probability of detection when the signal is present, at fixed false alarm rate [12]. In particular, using the
generalized likelihood ratio

\Lambda'[y(t)] = \max_{s(t) \in W} \frac{p[y(t) \mid s(t)]}{p[y(t) \mid 0]} (5)

as the detection statistic does not guarantee optimality [?].

For signals with excellent theoretical models, such as binary neutron star or black hole inspirals, the function space
W is compact enough that the least-favorable prior is approximately uniform in s(t), so that the generalized likelihood
ratio [Eq. (5)] derives from Eq. (4) as the optimal detection statistic. This leads to a convenient implementation [13]
which simply thresholds on the maximum of the correlation [Eq. (3)] over a filter bank, which is a discretely sampled
version of the function space W.

For the case where the signal space is not simple enough to allow matched filtering, that is when the signal is not
well-modeled, a number of incoherent methods have been proposed in the literature. One of them is the so-called
excess power detector [15], which basically thresholds on the power integrated over a large number of different shapes
at different positions in the time-frequency plane. In their discussion of the optimality of the excess power detector,
the authors of [15] use a prior uniform in the "whitened" signal subspace of all waveforms of finite duration and
bandwidth, which is imposed from physical arguments, and their result is therefore only a proof of optimality with
respect to that particular prior. In particular, this prior is not necessarily least-favorable, and their result is therefore
not a proof of optimality in the modified Neyman-Pearson sense. Similarly, the author of [16] chooses a different prior,
which is uniform in the "unwhitened" signal space, to derive a detector which is optimal with respect to this prior,
and which is similar to the excess power detector, but is perhaps better adapted to colored instrumental noise.

In addition to these detectors, a number of ad hoc methods have been proposed to solve the transient detection
problem for unmodeled transients. They are based on the analysis of patterns in the time-frequency plane [17], [18],
or on the time-domain analysis of the data with various filters [19]. Only the authors of [19] discuss optimality, from
a numerical point of view, i.e. by using a small number of simulated signals from W that are injected into noise in
numerical simulations in order to compare the performance of a few different detection algorithms.

When the noise is Gaussian, the likelihood ratio in Eq. (4) can be rewritten as [20]:

\Lambda[y(t)] = \exp\left[ \frac{1}{\epsilon^2} \fint_0^T y(t)\,\hat{s}(t)\,dt - \frac{1}{2\epsilon^2} \int_0^T \hat{s}^2(t)\,dt \right], (6)

where ŝ(t) is the causal minimum mean-square error estimator of s(t), i.e.,

ŝ(t) = E[s(t) | y(τ)], (7)

for E[·|y(τ)] the expectation over the noise given the observation y(τ) for τ < t, assuming the model of Eq. (1).
The barred integral sign (\fint) represents the Itô stochastic integral [21]. Of course, the computation of ŝ(t) in Eq. (7) requires the
knowledge of the prior p[s(t)], but the structure of Eq. (6) suggests that an efficient approach to problems for which
the integration in Eq. (4) cannot be carried out effectively (due to a complex or unknown prior, for instance) might be
to develop efficient estimators of the signal, and to use them as filters to test for the presence of signals in the data
(i.e., to use an estimator-correlator design).

It is shown in [22] that (i) transforming the (discretely sampled) observations {y_i : i = 1, ..., N} to a wavelet basis,
(ii) applying on the transformed observations {ỹ_i} the non-linearity

\hat{s}_i = \begin{cases} 0 & \text{if } |\tilde{y}_i| < \eta \\ \tilde{y}_i - \eta & \text{if } \tilde{y}_i > \eta \\ \tilde{y}_i + \eta & \text{if } \tilde{y}_i < -\eta, \end{cases} (8)

for a threshold η ~ log(N)/N, and (iii) transforming the truncated observations {ŝ_i} back to the time domain
leads to an estimator that is optimal in the sense of giving the smallest value of the maximum of the expected
least-square error over a wide class of signals. Using this estimator and preserving the structure of Eq. (6), a simple
quantity to threshold on is

\sum_{i=1}^{N} \left( \tilde{y}_i \hat{s}_i - \frac{\hat{s}_i^2}{2} \right), (9)

which can be rewritten as

\sum_{i=1}^{N} \left( \frac{\hat{s}_i^2}{2} + \eta\,|\hat{s}_i| \right). (10)

The algorithmic structure of TFCLUSTERS corresponds to this model; the two principal differences are that a short-time
Fourier basis is used instead of a wavelet basis, and that an additional step involving the analysis of the correlations
between the non-zero components of the estimator ŝ is introduced.
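
The thresholding estimator and the two equivalent forms of the statistic can be illustrated with a minimal sketch. It assumes that the non-linearity is the standard soft-thresholding rule (shrink every coefficient towards zero by η and zero those with magnitude below η); the function names are illustrative only and do not come from any existing implementation.

```python
def soft_threshold(y, eta):
    """Soft-thresholding (assumed form of the non-linearity):
    shrink each coefficient towards zero by eta."""
    out = []
    for v in y:
        if v > eta:
            out.append(v - eta)
        elif v < -eta:
            out.append(v + eta)
        else:
            out.append(0.0)
    return out


def statistic_correlator(y, s_hat):
    """Estimator-correlator form: sum of y_i * s_i - s_i^2 / 2."""
    return sum(v * s - 0.5 * s * s for v, s in zip(y, s_hat))


def statistic_rewritten(s_hat, eta):
    """Equivalent form in terms of the estimate only: sum of s_i^2 / 2 + eta * |s_i|."""
    return sum(0.5 * s * s + eta * abs(s) for s in s_hat)
```

For y = [3.0, -0.5, -2.0] and η = 1 the estimate is [2.0, 0.0, -1.0], and both forms of the statistic evaluate to 5.5, as expected from substituting y_i = ŝ_i ± η into the first form.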



III. ALGORITHM 



It will be assumed that N data corresponding to the gravitational wave strain are read and stored as real numbers
into a vector y. The data are time ordered and uniformly sampled with sampling frequency f_s, so the i-th component
y_i of the vector y is the value of the measured strain at time i/f_s: y_i = y(i/f_s).

Step (i) of TFCLUSTERS is the construction of the time-frequency representation of the data from the spectrogram:

P_{ij}(y) = |\tilde{y}_{ij}|^2, (11)

where ỹ_ij is the j-th component of the discrete Fourier transform of the i-th segment of data of length M, y_i =
{y_k : k = 1 + (i − 1)M, ..., iM}. M is assumed to be even and to be a factor of N, so that i = 1, ..., N/M,
and j = 1, ..., M/2 + 1. The components j = 1 and j = M/2 + 1 correspond to the DC and Nyquist frequencies, and will not be
considered anywhere below. Consequently, the maximum number of useful pixels in the time-frequency representation
will be N_s = (M/2 − 1)(N/M). The time resolution T of the time-frequency representation is fixed and is given by
T = M/f_s. The frequency resolution F is simply F = 1/T. In this time-frequency representation, the two hypotheses
expressed in Eqs. (1) and (2) become P_ij(y) = |s̃_ij + ε ñ_ij|² and P_ij(y) = |ε ñ_ij|², respectively.
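
Step (i) can be sketched as follows. This is only an illustration under stated assumptions: an unnormalized discrete Fourier transform is used (the normalization is not specified in the text), indices are zero-based so that bins 0 and M/2 are the DC and Nyquist components, and the name `spectrogram_power` is hypothetical. A direct O(M²) transform is used to keep the example free of external libraries.

```python
import cmath


def spectrogram_power(y, M):
    """Power |DFT|^2 of consecutive non-overlapping segments of length M.

    Returns one row per segment, with M/2 - 1 entries per row
    (DC and Nyquist bins are dropped).
    """
    if M % 2 or len(y) % M:
        raise ValueError("M must be even and must divide len(y)")
    rows = []
    for i in range(len(y) // M):
        seg = y[i * M:(i + 1) * M]
        row = []
        for j in range(1, M // 2):  # skip DC (j = 0) and Nyquist (j = M/2)
            c = sum(seg[k] * cmath.exp(-2j * cmath.pi * j * k / M)
                    for k in range(M))
            row.append(abs(c) ** 2)
        rows.append(row)
    return rows
```

With N = 16 samples and M = 8 this returns N/M = 2 rows of M/2 − 1 = 3 pixels each, in agreement with the pixel count N_s = (M/2 − 1)(N/M).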

If the noise n is Gaussian and white, the power in different pixels is statistically independent. When no signal is
present, P_ij is the sum of the squares of two independent Gaussian variables with zero mean and equal variance, and
is therefore exponentially distributed:

p_{P_{ij}}(P) = \frac{e^{-P/\epsilon^2}}{\epsilon^2}. (12)

When a signal is present, the Gaussian variables have non-zero mean, and the probability density function (hereafter,
pdf) of the power is [23]:

p_{P_{ij}}(P \mid |\tilde{s}_{ij}|^2) = \frac{e^{-(P + |\tilde{s}_{ij}|^2)/\epsilon^2}}{\epsilon^2}\, I_0\!\left( \frac{2\sqrt{P\,|\tilde{s}_{ij}|^2}}{\epsilon^2} \right), (13)

where I_0 is the modified Bessel function of order zero.

The spectrogram representation is of course not the only time-frequency representation available, and may not be 
optimal for most signals. It is however the simplest one to work with in this exploratory work. Wavelet bases share the
independence property of the noise in the different pixels, as does any orthonormal basis, and they are known to offer
better localization properties than Fourier transforms for many signals [23]. However, their dyadic representation
makes their analysis more complicated, due to the varying shape of the pixels with scale. Another classical way
to improve the spectrogram representation is to use windows and to overlap the segments. This is a good way to
reduce artifacts from Fourier transforms (e.g. edge effects) and to increase the time resolution of the time-frequency
representation, but at the cost of destroying the statistical independence of the pixels. This independence is essential
for the calculations involved in the clustering analysis presented in section V and for the correct interpretation of
some of the TFCLUSTERS thresholds. Any practical implementation of TFCLUSTERS can, however, deal equally well 
with spectrograms built with windows and overlapping segments as it does with the simpler type of spectrogram 
described above. 

Step (ii) of TFCLUSTERS consists in applying a threshold on the power P_ij. Pixels with P_ij > η are called black
pixels, and other pixels are called white pixels. The probability for any given pixel to be black when no signal is
present (the black pixel probability, p), is given by:

p = \exp(-\eta/\epsilon^2). (14)

Figure 1 illustrates the result from step (ii) and the effects of steps (iii) and (iv) below on simulated data.
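
In practice it is often convenient to fix the black pixel probability and to deduce the power threshold from it by inverting the exponential relation above. The sketch below does this and applies the threshold; the function names and the representation of the spectrogram as a list of rows are assumptions made for the example.

```python
import math


def threshold_for_black_probability(p, eps=1.0):
    """Power threshold eta giving black pixel probability p in noise of level eps."""
    return -eps ** 2 * math.log(p)


def black_pixels(P, eta):
    """Set of (i, j) coordinates of pixels whose power exceeds eta."""
    return {(i, j)
            for i, row in enumerate(P)
            for j, power in enumerate(row)
            if power > eta}
```

For instance, a black pixel probability of 0.1 with ε = 1 corresponds to a threshold η = ln 10 ≈ 2.30.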

Step (iii) of TFCLUSTERS considers the clustering of the black pixels. A cluster is defined as a set of black pixels 
containing all the black pixels that are the nearest neighbour of any pixel in the set. For a given pixel, its nearest 
neighbours are the pixels immediately to its left and to its right (time steps immediately before and after), and above 
and below it (frequency difference equal to the spectrogram resolution F). Two pixels touching only by a "corner" 
are called next nearest neighbours. The size S of a cluster is simply defined as the number of black pixels it contains. 
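
The cluster definition above amounts to finding the connected components of the black pixels under nearest-neighbour (four-way) connectivity. A minimal flood-fill sketch is given below; the function name and the representation of clusters as sets of (i, j) coordinates are choices made for this example.

```python
def find_clusters(black):
    """Group black pixels, given as a set of (i, j), into nearest-neighbour clusters."""
    remaining = set(black)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, stack = {seed}, [seed]
        while stack:
            i, j = stack.pop()
            # Only pixels sharing a side are connected; corners do not count.
            for nb in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    cluster.add(nb)
                    stack.append(nb)
        clusters.append(cluster)
    return clusters
```

Pixels touching only by a corner end up in different clusters, so that a diagonal line of black pixels is split into clusters of size one.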




The notion of distance d_c between two clusters Γ_1 and Γ_2 is defined as the minimal distance d_p between any two
pixels p_1 and p_2 in the two clusters,

d_p(p_1, p_2) = |i_1 - i_2| + |j_1 - j_2|, (15)

d_c(\Gamma_1, \Gamma_2) = \min_{p_1 \in \Gamma_1,\, p_2 \in \Gamma_2} d_p(p_1, p_2), (16)

where (i_k, j_k) are the coordinates of p_k, i.e., p_1 corresponds to P_{i_1 j_1}(y), etc. Hence, nearest neighbour pixels have
d_p = 1, next-nearest neighbour pixels have d_p = 2, and any two clusters must have d_c ≥ 2. This choice of distance is
made for convenience, but it has the implication of making the distance isotropic in the spectrogram representation
of the time-frequency domain, irrespective of the actual values of its time and frequency resolutions. Building a
spectrogram with very long time bins and therefore very narrow frequency bins would have the effect of making the
distances "longer" in the frequency direction than a spectrogram with short time bins and wide frequency bins.
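
The two distances can be written directly as a short sketch, with pixels represented as (i, j) tuples and clusters as sets of pixels (a representation assumed for this example). The brute-force minimum over all pixel pairs is adequate for the small clusters of interest here.

```python
def pixel_distance(p1, p2):
    """Distance between two pixels: sum of the time and frequency index offsets."""
    return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])


def cluster_distance(c1, c2):
    """Minimal pixel distance between any pixel of c1 and any pixel of c2."""
    return min(pixel_distance(a, b) for a in c1 for b in c2)
```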

Thresholds are applied both on the size of the cluster and on its distance to other clusters. The latter is easily 
motivated for physical signals; for example, although clusters of size two that are produced by fluctuations in the 
noise could be likely in a certain experiment, and therefore be below the size threshold, to have two or more such 
clusters close to each other, say being next nearest neighbours, could be rather unlikely. Hence, a signal with a
well defined frequency that is slowly varying, so that its time-frequency track is a thin curve, could easily produce an 
archipelago of clusters of size two that is statistically unlikely to be produced by noise fluctuations alone. 

If a cluster Γ_1 has size S_1 ≥ σ, it is immediately accepted as significant. Otherwise, its distance to all other clusters
Γ_2 with size S_2 < σ is compared to a threshold δ_{S_1,S_2} which depends explicitly on the size of the two clusters. All the
clusters with d_c(Γ_1, Γ_2) ≤ δ_{S_1,S_2} are merged into a generalized cluster, which is declared significant. If Γ_2 is already
in a separate generalized cluster, Γ_1 is added to that generalized cluster. If no cluster Γ_2 with S_2 < σ satisfies the
distance criterion, Γ_1 is rejected. For a given choice of the minimum cluster size σ, there are σ(σ − 1)/2 distance
thresholds δ_{S_1,S_2}, which will be organized below as a vector δ:

\delta = [\delta_{S_1,S_2}] = [\delta_{1,1}, \delta_{1,2}, \ldots, \delta_{1,\sigma-1}, \delta_{2,2}, \ldots, \delta_{2,\sigma-1}, \ldots, \delta_{\sigma-1,\sigma-1}]. (17)
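
The selection rules of step (iii) can be sketched as below. Several choices are assumptions of this example rather than statements about the actual implementation: clusters of size at least σ are accepted, two small clusters are linked when their distance is less than or equal to the threshold for their pair of sizes, the thresholds are stored in a dictionary keyed by the sorted pair of sizes, and a missing entry is treated as a threshold of zero (never linked). Linked clusters are merged transitively with a union-find structure.

```python
from itertools import combinations


def cluster_distance(c1, c2):
    """Minimal sum of index offsets between pixels of two clusters."""
    return min(abs(a[0] - b[0]) + abs(a[1] - b[1]) for a in c1 for b in c2)


def significant_clusters(clusters, sigma, delta):
    """Return the clusters and generalized clusters that survive step (iii).

    clusters: list of sets of (i, j) pixels.
    sigma:    minimum size for unconditional acceptance.
    delta:    dict mapping (S1, S2), S1 <= S2 < sigma, to a distance threshold.
    """
    large = [set(c) for c in clusters if len(c) >= sigma]
    small = [set(c) for c in clusters if len(c) < sigma]
    parent = list(range(len(small)))

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    linked = set()
    for a, b in combinations(range(len(small)), 2):
        sizes = tuple(sorted((len(small[a]), len(small[b]))))
        if cluster_distance(small[a], small[b]) <= delta.get(sizes, 0):
            parent[find(a)] = find(b)
            linked.update((a, b))

    merged = {}
    for k in linked:
        merged.setdefault(find(k), set()).update(small[k])
    # Small clusters that were never linked are rejected.
    return large + list(merged.values())
```

As a usage example, with σ = 5 and a single non-zero threshold δ_{2,2} = 2, two clusters of size two that are next nearest neighbours are merged into one generalized cluster of four pixels, while an isolated single pixel is rejected.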

Finally, step (iv) of TFCLUSTERS considers significant clusters from step (iii) and thresholds on their excess power Q,
which is defined as

Q = \sum_{(i,j) \in \Gamma} \left( P_{ij}(y) - \eta \right), (18)

for any given cluster Γ of size S. If no signal is present, i.e. if the cluster Γ results from a fluctuation of the noise,
the pdf of P_ij(y) after the thresholding of step (ii) will be a truncated exponential for (i, j) ∈ Γ:

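p_{P_{ij}}(P) = \begin{cases} \epsilon^{-2}\, e^{-(P - \eta)/\epsilon^2} & \text{if } P > \eta \\ 0 & \text{otherwise.} \end{cases} (19)

The pdf of Q will be the convolution of S such distributions, i.e.

p_Q(P) = \begin{cases} \dfrac{(P - S\eta)^{S-1}\, e^{-(P - S\eta)/\epsilon^2}}{(S-1)!\,\epsilon^{2S}} & \text{if } P > S\eta \\ 0 & \text{otherwise.} \end{cases} (20)

Hence, setting a size-dependent threshold Q_S on Q, defined by the integral equation

\alpha = \int_{S\eta}^{Q_S} p_Q(P)\, dP, (21)

leads when only noise is present to the rejection of a fraction α of the clusters that survived step (iii), independently
of the cluster size S.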
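
A size-dependent threshold of this kind can be computed numerically, as in the sketch below. It works with the excess power above Sη, for which each pixel contributes an exponential variable of mean ε², so that the sum over S pixels follows an Erlang (gamma) distribution with a closed-form cumulative distribution. The convention that α is the fraction of noise clusters rejected follows the text; the function names and the bisection solver are choices made for this example.

```python
import math


def erlang_cdf(x, S, eps=1.0):
    """P(excess power <= x) for a noise cluster of S pixels."""
    if x <= 0:
        return 0.0
    z = x / eps ** 2
    term, total = 1.0, 1.0
    for k in range(1, S):
        term *= z / k          # z^k / k!
        total += term
    return 1.0 - math.exp(-z) * total


def excess_power_threshold(S, alpha, eps=1.0, tol=1e-12):
    """Excess power below which a fraction alpha of noise clusters of size S falls."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    lo, hi = 0.0, eps ** 2
    while erlang_cdf(hi, S, eps) < alpha:   # bracket the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):     # bisection
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, S, eps) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For S = 1 the result reduces to −ε² ln(1 − α), and the threshold grows with S at fixed α, so that every cluster size is penalized by the same fraction.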

Step (iv) is very similar to the scheme used in the excess power algorithm of [15]. The essential difference is that
TFCLUSTERS chooses the pixels included in the computation of the excess power by using a threshold on the individual
pixel power and a clustering analysis, while the authors of [15] use a fixed set of "masks" of different shapes that are translated in
the time-frequency plane, and over which the power for all pixels is integrated. The definition of these masks requires
some prior expectations about the signal characteristics, and when this prior information is not sufficient to constrain
the signal shapes to look for in the time-frequency plane, the selective approach of TFCLUSTERS is expected to become
more efficient than the excess power approach of [15].

It should also be noted that other kinds of thresholds on the total cluster power could be used in step (iv). While
the one presented here penalizes in a similar fashion clusters of all sizes (it reduces the false alarm rate by α independently
of S), a size-independent threshold could also be used, for instance, in which case large clusters would be more likely
to survive the last step of TFCLUSTERS.



IV. OPERATING CHARACTERISTICS 



Assessing the optimality of TFCLUSTERS in the modified Neyman-Pearson sense is a very involved mathematical
task, as can be inferred from similar problems treated in the literature [25]. Nevertheless, it is shown in this section
that the algorithm presents interesting properties that suggest its near-optimality for a large class of signals. What
will actually be discussed is a simplified version of the algorithm not involving the clustering analysis (step (iii) of
TFCLUSTERS). It is assumed that the clustering analysis will improve the performance of this detector when signals
that form clusters in the time-frequency plane are indeed present. 

The binary test will be constructed by comparing the signal power estimate, Q̂(y) = |ŝ|², to a certain threshold ζ;
Q̂ > ζ will lead to the acceptance of Eq. (1). This estimator is constructed by summing the power in the spectrogram
over the pixels that have power above a certain threshold:

|\hat{s}_{ij}|^2 = \begin{cases} 0 & \text{if } P_{ij}(y) < \eta \\ P_{ij}(y) - \eta & \text{otherwise} \end{cases} (22)

= \left( P_{ij}(y) - \eta \right)_+, (23)

\hat{Q}(y) = \sum_{i,j} |\hat{s}_{ij}|^2. (24)

The sum in Eq. (24) is performed over the whole time-frequency plane, and therefore is over N/2 terms.
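
The simplified estimator is a one-line computation once the spectrogram is available: every pixel contributes its power in excess of the threshold, and pixels below the threshold contribute nothing. A minimal sketch follows, with the spectrogram represented as a list of rows and a hypothetical function name.

```python
def power_estimate(P, eta):
    """Sum over all pixels of the positive part of (power - eta)."""
    return sum(max(power - eta, 0.0) for row in P for power in row)
```

For the 2 × 2 spectrogram [[0.5, 3.0], [2.5, 1.0]] and η = 2, only the pixels with power 3.0 and 2.5 contribute, giving an estimate of 1.5.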

The following theorem, inspired by the work of [22] on signal estimation, is proved in appendix A:
Theorem 1:

Given the model Eqs. (22)-(24), for Q = |s|², and provided that η = βε² log(N/2), β > 1, there exists a series of
numbers π_N with π_N → 1 for N ≫ 1 such that ∀ s ∈ R^N:

(i) Pr(Q̂ ≤ Q) = π_N,

(ii) Pr(Q̂ ≥ q̂) = π_N, where q̂ is any power estimator with Pr(q̂ ≤ Q') = π_N ∀ s' ∈ Ω(s), where Ω(s) is some
neighbourhood of s defined in Eq. (A4), and Q' = |s'|²,

(iii) Pr(Q̂ ≥ Q − Σ_{i,j} min(2βε² log(N/2), |s̃_ij|²)) = π_N.

This theorem leads to a number of observations: 



• Optimality: Theorem 1 shows the optimality of Q̂ in the subset of detectors for which the condition stated in
part (i) is respected: Q̂ provides the tightest lower bound to Q that can be constructed from the data, for all s.

• False alarm probability: For a threshold ζ on Q̂, part (i) implies that the probability to detect a signal with
power Q < ζ goes to zero asymptotically with N getting large. In particular, part (i) implies that the false
alarm probability goes to zero asymptotically, since when no signal is present Q = 0.

• False dismissal probability: The direct interpretation of part (iii) is that any signal with Q > ζ +
Σ_{i,j} min(2βε² log(N/2), |s̃_ij|²), or equivalently with Σ_{i,j} (|s̃_ij|² − 2βε² log(N/2))_+ > ζ, will be detected with a
false dismissal probability approaching zero. Obviously, if max |s̃_ij|² < 2βε² log(N/2), part (iii) implies that
Q̂ = 0, giving a limit on how sensitive this detector can be.

• Sparse signals: Also as a consequence of part (iii), sparse signals that have only a small number of non-zero s̃_ij
are likely to be more easily detected, because they have a smaller value of Σ_{i,j} min(2βε² log(N/2), |s̃_ij|²) than
signals with the same power spread over a larger number of pixels.

The scheme discussed above of thresholding on the power P_ij(y) integrated over pixels that have power above
a certain threshold η thus possesses certain optimal properties. It is very efficient for signals that have a sparse
representation in the time-frequency domain. It is unlikely, however, that the optimal properties are preserved when
the signal subspace is restricted to signals that form clusters in the time-frequency domain; in that case, intelligence
about the spatial correlation of the signal pixels should certainly allow for more efficient algorithms. The idea behind
TFCLUSTERS is that the ad hoc approach of merging the two-threshold scheme discussed above with a clustering
analysis designed to effectively reject noise should lead to an efficient algorithm for such a signal subspace.



V. CLUSTERING ANALYSIS 



On physical grounds, it can be expected for most transient sources that the pixels with excess power due to the signal 
will tend to cluster in the time-frequency domain. Short signals like black hole ringdowns or mergers, for instance, 
have durations roughly of the order of the inverse of the usable interferometer bandwidths (~ 1 kHz), and therefore 
appear as connected clusters of duration T and bandwidth equal to the search bandwidth. For longer signals that 
spend tens or hundreds of cycles in the interferometer's band, the required stability of the source is likely to necessitate
that the dominating mechanism for the emission of the waves is governed by rotation, with the wave instantaneous
frequency being around J/(πI), for J the magnitude of the angular momentum along the principal axis of rotation,
and I the moment of inertia of the source. Complicated dynamics may lead to the spreading of the signal power over
some finite frequency interval, or even the formation of sidebands well-separated from the principal frequency of the 
waves (in this case, each sideband will be considered an independent cluster). When the source is not too far away, 
it is argued below that the amount of angular momentum radiated by gravitational waves for sources that are near 
the sensitivity limit of the interferometric detectors is insufficient to cause a change in the rotation frequency that is 
rapid enough to produce disconnected pixels in the time-frequency representation of the signal. 

Over a time interval T corresponding to one time slice in the time-frequency representation, a source at a distance
r from the Earth that emits uniformly in all directions waves of frequency f and characteristic strain amplitude h
will radiate a total amount of energy ΔE, where [?]

\Delta E \sim \pi^2 \frac{c^3}{G} f^2 h^2 r^2 T. (25)

The change ΔJ in angular momentum magnitude corresponding to the emission of the waves is related to the amount
of radiated energy by

\Delta J \sim \frac{\Delta E}{\pi f} (26)

if most of the radiation is quadrupolar [26]. On the other hand, for sources with dynamics dominated by rotation,
the second time derivative of the mass quadrupole moment is bounded by Q̈ ≲ πJf, so for h = 2GQ̈/(rc⁴),

J \gtrsim \frac{c^4 h r}{2 \pi G f}. (27)

The value of the characteristic amplitude h is expressed in terms of the noise power spectral density S_n(f) so that the
signal-to-noise ratio for a signal with bandwidth 1/T is unity, corresponding to a marginally detectable source:

h = S_n^{1/2}(f) / T^{1/2}. (28)
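
Combining Eqs. (25)-(28):

\frac{|\Delta J|}{J} \lesssim \frac{2 \pi^2 f^2 r\, S_n^{1/2}(f)\, T^{1/2}}{c} \approx 3 \times 10^{-4} \left( \frac{r}{1\ \mathrm{Mpc}} \right) \left( \frac{f}{100\ \mathrm{Hz}} \right)^{2} \left( \frac{S_n^{1/2}(f)}{5 \times 10^{-23}\ \mathrm{Hz}^{-1/2}} \right) \left( \frac{T}{0.1\ \mathrm{s}} \right)^{1/2}, (29)
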
where the numbers in Eq. (29) for S_n(f) and f correspond approximately to the minimum of the noise spectral
density of the interferometers being presently developed, and the value of T is chosen to match the expected time
resolution of the time-frequency representations to be used on actual data. For values of r that are sufficiently small,
a source does not need to radiate gravitational waves at a rate that produces significant variations in its angular momentum
magnitude in order to be detectable. In order of magnitude, |ΔJ|/J ~ |Δf|/f, for Δf the change
in the wave frequency over a time T, provided that the source doesn't change its moment of inertia by a large fraction,
which is unlikely to happen over timescales of 0.1 s, except perhaps near the end of the gravitational wave signal (as
e.g. in binary inspirals, at the innermost stable circular orbit), at which point the question of the amount of clustering
becomes irrelevant. Hence, the wave frequency is not expected to vary enough over the time resolution T to generate
pixels in contiguous time slices that are disconnected. This can happen if

|\Delta f| \gtrsim \frac{1}{T}. (30)

Combining Eqs. (29) and (30) gives a necessary (but not sufficient) condition for pixels to be disconnected:

r > 300\ \mathrm{Mpc} \left( \frac{100\ \mathrm{Hz}}{f} \right)^{3} \left( \frac{5 \times 10^{-23}\ \mathrm{Hz}^{-1/2}}{S_n^{1/2}(f)} \right) \left( \frac{0.1\ \mathrm{s}}{T} \right)^{3/2}. (31)

In this context, any rotation dominated source at the detection limit of the detector that is closer than r is expected
to form a cluster in the time-frequency domain. Figure 2 shows the variation of r with frequency, for the noise spectral
density of the LIGO interferometers [27]. Sources with f < 300 Hz will form detectable clusters even if they are as
far as the Virgo cluster; those with f > 1250 Hz, even if they are in the galaxy, may not form clusters.
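
The order of magnitude of this limiting distance can be checked numerically. The sketch below assumes that pixels become disconnected when the frequency drifts by more than one frequency bin, |Δf| > 1/T, and uses the bound |Δf|/f ≲ 2π² f² r S_n^{1/2}(f) T^{1/2} / c that follows from combining the energy, angular momentum and strain estimates of this section. Both the closed form and the function name are assumptions of this example; the physical constants are rounded.

```python
import math

C = 2.998e8       # speed of light, m/s
MPC = 3.0857e22   # metres per megaparsec


def max_clustering_distance_mpc(f=100.0, sqrt_sn=5e-23, T=0.1):
    """Distance (Mpc) below which a marginally detectable rotating source
    cannot drift by a full frequency bin within one time slice.

    f:       wave frequency in Hz
    sqrt_sn: square root of the noise spectral density, Hz^(-1/2)
    T:       time resolution in s
    """
    r_metres = C / (2.0 * math.pi ** 2 * f ** 3 * sqrt_sn * T ** 1.5)
    return r_metres / MPC
```

With the default values the result is close to 300 Mpc, and doubling the frequency at fixed noise level reduces it by a factor of eight, reflecting the f⁻³ scaling.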

These considerations show that under certain restrictions, at least two broad classes of sources (short impulsive and 
rotation dominated at the detection limit) should lead to signals that form clusters in the time-frequency plane. As it 
will be shown below, Gaussian noise will have the opposite property, in the sense that black pixels will tend to fill the 
plane uniformly, without forming large clusters. This will be used as a powerful basis for denoising the thresholded 
spectrograms computed from the data. 



A. False Alarm Rate 



The reader familiar with the mathematics of percolation theory in statistical physics has most likely recognized at 
this point the applicability of the results in that field to the problem under investigation here. In particular, for a 
black pixel probability p, the average number of clusters of size S per pixel of an infinite image is: 

$$ \langle n_S(p) \rangle = p^S D_S(1-p), \tag{32} $$

where $D_S$ is the so-called perimeter polynomial, and is related to the number of shapes a cluster of size $S$ can have
(counting shapes related only by a translation as identical), that is, to the degeneracy $g_{SR}$ of a cluster of size $S$ and
perimeter $R$:

$$ D_S(q) = \sum_R g_{SR}\, q^R. \tag{33} $$

As can be seen from Eq. (33), the perimeter of a cluster is the number of white pixels having at least one black
pixel in the cluster as a nearest neighbour. Coefficients of the perimeter polynomials for cluster sizes up to $S = 22$ for
nearest neighbours on the square lattice are tabulated in pq | , and were mostly generated through the use of computer
enumeration techniques. Note that $\langle n_S(p) \rangle$ is simply the probability of any pixel to be in a cluster of size $S$, divided
by $S$ [f§.

For low cluster densities, the expected number of clusters per unit time $\lambda$, i.e. the cluster false alarm rate, is related
for a threshold $\sigma$ on the cluster size to the time and frequency resolutions, and to the bandwidth $B$ that is searched:

$$ \lambda = \frac{B}{TF} \sum_{S=\sigma}^{\infty} \langle n_S(p) \rangle. \tag{34} $$

In practice, because of the rapid decay of $\langle n_S(p) \rangle$ with $S$, the sum can easily be truncated without major loss of
precision. Figure 3 illustrates the nearly exponential decay of the cluster rate with the threshold on the cluster size,
for different black pixel probabilities.
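
As a minimal sketch of how Eqs. (32) and (33) can be evaluated, the Python code below enumerates nearest-neighbour clusters on the square lattice by brute force (instead of reading tabulated coefficients), counts them by perimeter, and sums the mean cluster numbers above a size threshold. The function names, the truncation parameter `s_max`, and the assumption that the rate equals the number of pixels per unit time, B/(TF), times the mean number of clusters per pixel are illustrative choices made here; they are not taken from the TFCLUSTERS implementation.

```python
from collections import Counter

NEIGHBOUR_STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def fixed_clusters(size):
    """Return all nearest-neighbour clusters of `size` pixels, up to translation."""
    def normalise(cells):
        # Shift the shape so that its bounding box starts at the origin.
        min_x = min(x for x, _ in cells)
        min_y = min(y for _, y in cells)
        return frozenset((x - min_x, y - min_y) for x, y in cells)

    shapes = {frozenset({(0, 0)})}
    for _ in range(size - 1):
        grown = set()
        for shape in shapes:
            for x, y in shape:
                for dx, dy in NEIGHBOUR_STEPS:
                    cell = (x + dx, y + dy)
                    if cell not in shape:
                        grown.add(normalise(shape | {cell}))
        shapes = grown
    return shapes


def perimeter_counts(size):
    """Map perimeter R -> number g_SR of cluster shapes of this size."""
    counts = Counter()
    for shape in fixed_clusters(size):
        perimeter = {(x + dx, y + dy) for x, y in shape
                     for dx, dy in NEIGHBOUR_STEPS} - shape
        counts[len(perimeter)] += 1
    return dict(counts)


def mean_clusters_per_pixel(size, p):
    """Evaluate p^S * D_S(1 - p), with D_S built from the enumerated shapes."""
    q = 1.0 - p
    return p ** size * sum(g * q ** R for R, g in perimeter_counts(size).items())


def cluster_rate(p, sigma, T, F, B, s_max=8):
    """Truncated sum over cluster sizes sigma..s_max, scaled by B / (T * F)."""
    total = sum(mean_clusters_per_pixel(S, p) for S in range(sigma, s_max + 1))
    return B / (T * F) * total


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    print(cluster_rate(p=0.01, sigma=5, T=1.0, F=1.0, B=1023.0))
```

Brute-force enumeration is only practical for small cluster sizes (a few thousand shapes at S = 8), which is acceptable here because the sum is dominated by its first few terms when p is small.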

In analogy to Eq. (32), the average number per pixel of pairs of clusters of size $S_1$ and $S_2$ separated by a distance
$d$ is defined as:

$$ \langle \nu_{S_1,S_2}(d) \rangle = p^{S_1+S_2} H^d_{S_1 S_2}(1-p), \tag{35} $$

where the polynomial $H^d_{S_1 S_2}$ is related to the number of configurations $k_{S_1 S_2 R}(d)$ for a cluster of size $S_1$ to be within a
distance $d$ from a cluster of size $S_2$, where the sum of the perimeters of the two clusters is $R$, and where configurations
that are related only by a translation are again considered identical:

$$ H^d_{S_1 S_2}(q) = \sum_R k_{S_1 S_2 R}(d)\, q^R. \tag{36} $$

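As an example, for two clusters of size 2 separated by a distance of two, Eq. (35) becomes:

$$ \langle \nu_{2,2}(2) \rangle = \frac{40\,p^4 q^{10} + 56\,p^4 q^{11}}{4}, \tag{37} $$

with $q = 1 - p$. The factors in the numerator account for all the possible configurations for a cluster of size 2 to be a distance 2 from
another cluster of size 2, and the factor of 4 in the denominator is the total number of pixels, necessary in order not
to overcount clusters related by a simple translation. The general expression for $d > 2$ is

$$ \langle \nu_{2,2}(d) \rangle = 8(d+1)\,p^4 q^{12}, \tag{38} $$

and hence the associated false alarm rate is

$$ \lambda_{2,2} = \frac{B}{TF} \left[ 10\,p^4 q^{10} + 14\,p^4 q^{11} \right], \quad \text{if } \delta_{2,2} = 2, \tag{39} $$

$$ \lambda_{2,2} = \frac{B}{TF} \left[ 10\,p^4 q^{10} + 14\,p^4 q^{11} + \sum_{d=3}^{\delta_{2,2}} 8(d+1)\,p^4 q^{12} \right]
 = \frac{B}{TF} \left[ 10\,p^4 q^{10} + 14\,p^4 q^{11} + 4\left(\delta_{2,2}^2 + 3\delta_{2,2} - 10\right) p^4 q^{12} \right], \quad \text{if } \delta_{2,2} \ge 3, \tag{40} $$

where $\delta_{2,2}$ is the threshold on the distance for two clusters of size 2 to be considered correlated.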
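
The configuration counts entering these expressions can be checked by brute force. The sketch below assumes that the distance between two clusters is the smallest city-block (L1) separation between any of their pixels; this is an assumption made here for illustration, since the definition of distance is given elsewhere in the paper, and the function names are likewise hypothetical. Under that assumption, the code reports, per pixel and per unordered pair of size-2 clusters, how many configurations exist at a given distance for each total perimeter R.

```python
from collections import Counter
from itertools import product


def neighbours(cell):
    x, y = cell
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}


def perimeter(cluster):
    """White pixels that are nearest neighbours of at least one cluster pixel."""
    return set().union(*(neighbours(c) for c in cluster)) - set(cluster)


def distance(a, b):
    """Assumed definition: minimum city-block separation between the clusters."""
    return min(abs(x1 - x2) + abs(y1 - y2) for x1, y1 in a for x2, y2 in b)


def dominoes_in_window(half_width):
    """Every size-2 cluster whose lower-left pixel lies in a square window."""
    span = range(-half_width, half_width + 1)
    for x, y in product(span, span):
        yield frozenset({(x, y), (x + 1, y)})  # horizontal
        yield frozenset({(x, y), (x, y + 1)})  # vertical


def pair_counts(d):
    """Unordered pairs of size-2 clusters at distance d >= 2, keyed by perimeter R."""
    anchors = [frozenset({(0, 0), (1, 0)}), frozenset({(0, 0), (0, 1)})]
    ordered = Counter()
    for a in anchors:
        for b in dominoes_in_window(d + 3):
            if distance(a, b) == d:
                # Shared perimeter pixels are counted once in the total perimeter.
                ordered[len(perimeter(a) | perimeter(b))] += 1
    # Each unordered pair was seen twice (once from each member as anchor).
    return {R: n // 2 for R, n in sorted(ordered.items())}


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, pair_counts(d))
```

With this distance convention the enumeration gives 10 configurations with R = 10 and 14 with R = 11 at d = 2, and 8(d + 1) configurations with R = 12 for d > 2, since clusters further apart than 2 cannot share perimeter pixels.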

Table I gives the coefficients $k_{S_1 S_2 R}(d)$ for small cluster sizes that were obtained by computer enumeration. General
relations can easily be deduced from this table: on the square lattice, using the definition of distance from Eq. @, the
perimeter of a circle of radius $r$ is $4r$, and therefore the number of points at distance $d$ from a certain cluster increases
linearly with $d$. Moreover, when $d > 2$, the two clusters are guaranteed not to share any perimeter white pixels,
and therefore the number of configurations with fixed total perimeter consists of a "bulk" part (from the constant
number of configurations occurring at pixels that are on the same row or column as a pixel in the cluster), and of a
part growing linearly with $d$ (from the "diagonals" of the curve of constant distance from the first cluster). Table II
gives the formula for $\langle \nu_{S_1,S_2}(d) \rangle$ deduced from table I, for $d > 2$. Values for $d = 2$ can be read directly from table I.

Figure 4 shows how the expected number of pairs of clusters of size $S_1$ and $S_2$ (i.e., $\sum_{d=2}^{\delta_{S_1,S_2}} \langle \nu_{S_1,S_2}(d) \rangle$) increases
when the threshold $\delta_{S_1,S_2}$ is increased. As expected, it also shows that pairs of large clusters are less likely to
form than pairs of smaller clusters, and that the expected number of pairs of clusters decreases when the black pixel
probability is reduced.

It should be noted that TFCLUSTERS includes higher-order terms (n-cluster configurations, $n \ge 3$) in its definition
of generalized clusters. These complex configurations are built by merging all the clusters satisfying the distance
criteria, and consequently the sum of Eq. (34) and of its equivalent for Eq. (35) slightly overestimates the true false
alarm rate, especially when the distance thresholds $\delta_{S_1,S_2}$ are large for the density of small clusters.



B. Efficiency 

As shown above, the false alarm rate is computed analytically; when $\alpha = 0$ (cf. step (iv) of TFCLUSTERS), the
probability of detection of a certain signal can also be computed analytically. It is not possible at this time to model
the detection efficiency when $\alpha > 0$, because the equivalent to Eq. (|2^) when a non-null signal is present is not known
analytically and is required to compute the probability of detection of a given signal. The sensitivity and false alarm
rate of TFCLUSTERS depend on a number of parameters: $F$, $T$, $p$, $\sigma$ and $\delta$. Using some general information about the
expected waveform, it is possible to run an optimization over all these parameters to maximize the probability of
detection at fixed false alarm rate. More generally, it is possible to run multiple versions of the detector, each one for
a different set of parameters, in order to cover as many different classes of signal as possible. The number of classes
that can be considered is limited by the computational power available for the analysis, and is somewhat limited by
the fact that the results are not independent.

For comparison, the probability of detection using the ideal power detector can also be computed. This detector
assumes that the bandwidth and duration of the signal are known and that their product is the time-frequency volume
$V$, and it compares the excess power computed according to these parameters to a threshold $\kappa$. In order to get a
false alarm rate rather than a false alarm probability, it is assumed that the detector is applied with frequency $1/\tau$
on segments of independent data of length $\tau$, and that all frequency bins in the search bandwidth $B$ are considered in
disjoint groups having bandwidth $\phi$. This detector is incoherent and is therefore not as efficient as matched filtering,
but it still performs better than any detector that can be implemented when neither the duration nor the bandwidth
is known. The false alarm rate is:

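$$ \lambda = \frac{B}{\tau\phi} \int_\kappa^\infty \frac{P^{V-1}\, e^{-P/\epsilon^2}}{\epsilon^{2V}\,(V-1)!}\, dP, \tag{41} $$

and the probability of detection is [^3|

$$ \mathrm{POD} = \int_\kappa^\infty \frac{1}{\epsilon^2} \left( \frac{P}{\rho^2\epsilon^2} \right)^{(V-1)/2} \exp\left( -\frac{P + \rho^2\epsilon^2}{\epsilon^2} \right) I_{V-1}\left( \frac{2\rho\sqrt{P}}{\epsilon} \right) dP. \tag{42} $$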

The case where one or more of the bandwidth, duration, central frequency or arrival time parameters have to be
searched over is more complicated to analyze: the probabilities are not statistically independent, because of the need 
to search over overlapping regions in the time-frequency representation of the signal, and therefore the computation 
of the false alarm rate is rather involved. 

An example is now presented to make more precise how well TFCLUSTERS compares to the ideal
power detector. The signal is taken to be of duration $6T$ and bandwidth smaller than $F = 1/T$ (since the problem
is symmetric under the interchange of the time and frequency axes, the results below are equivalent to the case of a
short signal of duration $T$ and bandwidth $6/T$). It is assumed that the signal power is distributed uniformly over its
full duration and bandwidth. As a definite example, let the bandwidth of the search be chosen to be $B = 512F$, and
the false alarm rate be $\lambda = 1/(3600T)$. The central frequency is assumed to be such that the power is concentrated in
a single constant-frequency row of the spectrogram, while both the case of synchronized arrival time (the signal covers six pixels)
and the case of random arrival time (the signal covers seven pixels) are considered. For the ideal power detector,
$V = 6$ or 7, and $\tau$ is set to $6T$. For the TFCLUSTERS algorithm, the various thresholds ($\eta$, $\sigma$ and $\delta_{S_1,S_2}$, $\alpha$ being zero)
are optimized for every value of the signal-to-noise ratio in order to maximize the probability of detection (hereafter,
POD) under the constraint $\lambda \le 1/(3600T)$. The details of the calculation are presented in Appendix B. Figure 5 shows a
comparison of the optimized POD as a function of signal-to-noise ratio $\rho$, for TFCLUSTERS and the ideal power detector.
Only two sets of clustering analysis thresholds cover the whole range of signal-to-noise ratios of figure 5, illustrating
the relative insensitivity of the performance of TFCLUSTERS to its numerous parameters. Given that the
chosen signal was a line in the time-frequency plane, and was therefore the shape that is the easiest to "break" by
changing a black pixel into a white pixel, the example presented here can be considered a difficult situation; most
other distributions of the power (i.e. other signal shapes) would only bring the performance of TFCLUSTERS closer to
that of the ideal power detector.



VI. NUMERICAL SIMULATIONS 



Extensive numerical simulations were carried out in order to confirm the validity of the analyses presented in the
previous sections, and in order to explore properties of TFCLUSTERS that are hard to study analytically. All of the
results below were produced using the same method: segments of Gaussian white noise of unit variance are produced
using a random number generator [30]. The segments are of duration $10^4$ s, and are sampled at 2048 Hz. These
numbers are chosen so that processing a single segment uses most of the RAM of the computers on which the code is
running. On every segment, the implementation of TFCLUSTERS within the LIGO Scientific Collaboration Algorithm
Library [31] is used to generate a list of significant clusters, according to some pre-specified values of the various
thresholds. The 90% central confidence interval for the rate of significant clusters in the data, assuming they form
a Poisson process, is computed based on the number of detected clusters, using the standard Neyman construction
[32]. If the ratio of the width of the 90% confidence interval to its central value is smaller than 1%, the simulation
terminates. Otherwise, additional $10^4$ s segments are processed until the above termination criterion is met.
For a true rate of $\lambda$, the termination criterion requires processing of order $10^5/\lambda$ worth of simulated data.
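
The order of magnitude quoted above can be recovered with a short calculation. The sketch below replaces the exact Neyman construction by a Gaussian approximation to the Poisson interval (an assumption made here for simplicity, adequate for large counts), for which the central interval on N counts is approximately N ± z√N; the function names are illustrative.

```python
import math
from statistics import NormalDist


def counts_needed(rel_width=0.01, confidence=0.90):
    """Smallest N such that the interval width 2*z*sqrt(N) is below rel_width*N."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided quantile
    return math.ceil((2.0 * z / rel_width) ** 2)


def data_duration(rate_hz, rel_width=0.01, confidence=0.90):
    """Seconds of data needed to collect that many clusters at the given rate."""
    return counts_needed(rel_width, confidence) / rate_hz


if __name__ == "__main__":
    print(counts_needed())        # number of clusters required
    print(data_duration(1.0e-3))  # seconds, for a hypothetical rate of 1 mHz
```

The required number of clusters comes out slightly above $10^5$, so the required data duration is of order $10^5/\lambda$, in line with the estimate in the text.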

It should be noted that the timing distribution of clusters from white noise is indeed very well approximated by
a Poisson distribution. This is confirmed in one specific case by figure 6, which shows that the distribution of the
time delay $\Delta$ between two successive clusters follows an exponential distribution, which is the defining property of a
Poisson process.

The simulations were performed on a cluster of workstations with 1 GHz Intel Pentium III "Coppermine" processors and
512 Mbytes of PC-133 RAM, running the Linux operating system. For a large range of parameter values, the
processing time per CPU was on average 250 to 550 times shorter than the duration of the data segment. Most
of the time was spent grouping the black pixels into clusters, and quite logically the most important factor in
determining the speed of TFCLUSTERS was the black pixel probability; independently of the other parameters, the
ratio of real time to processing time was around 300 for $\eta/\epsilon^2 = 2$, around 500 for $\eta/\epsilon^2 = 4$, and increased almost
linearly with the power threshold $\eta$.



A. The Case $\delta = 0$, $\alpha = 0$

When $\delta = 0$ and $\alpha = 0$, the expected rate is given by Eq. (34). Figures 7 and 8 show the excellent agreement
between rates from simulations and predictions from Eq. (34). The two agree to better than 0.5% most of the time,
commensurate with the precision of the simulations. The sum in Eq. (34) is of course dominated by the first few
terms, but values of $\langle n_S(p) \rangle$ as predicted from Eq. (32) also describe the simulated data very well for large values of
$S$, as shown in figure 9.



B. The Case $\delta \neq 0$, $\alpha = 0$

For $\delta \neq 0$, there is a contribution to the cluster rate from both Eq. (34) and its equivalent for $\langle \nu_{S_1,S_2}(d) \rangle$. Figures
10 and 11 show the good agreement between rates from simulations and predictions. When $\eta/\epsilon^2 > 2.8$, the agreement
is at the precision level of the simulations, although there is a systematic overestimation of the measured rate by
the predictions for $\eta/\epsilon^2 < 3.7$. This overestimation, however, reaches almost 20% for $\eta/\epsilon^2 = 2$; this is expected, as
explained in section V, because for $\eta/\epsilon^2 < 2.8$ the cluster density is high enough that higher-order combinations above
the 2-cluster one are likely to be produced.

To demonstrate that the error is indeed due to higher-order terms, a histogram of the contribution $\lambda(S)$ to the
total rate of generalized clusters of size $S$ is presented in figure 12, together with the theoretical prediction based
on 1- and 2-cluster configurations. The predictions systematically overestimate the measured rate for $S < 8$, and
underestimate it for $S > 8$. It should be noted that for $\delta = [0, 0, 0, 0, 0, 0, 2, 3, 4, 4]$, 3-cluster configurations can have
sizes $8 \le S \le 12$, while 4-cluster configurations can have sizes $10 \le S \le 16$. Hence, one can expect a small error for
$S = 8$ corresponding to the 3-cluster $(S_1, S_2, S_3) = (2, 2, 4)$, and no errors from higher-order terms for $S < 8$, i.e. for
the terms dominating the sum leading to the prediction for $\lambda$. Overall, more small clusters than expected get merged
into generalized clusters due to these high-order terms, and the total rate is consequently overestimated. Figure 13
quantifies the importance of high-order terms as a function of the cluster density, i.e. as a function of the threshold
$\eta$; as expected, the relative importance of high-order terms becomes smaller than the 1% level around $\eta/\epsilon^2 = 3$, in
agreement with the results in figure 11.
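
The size ranges quoted for 3- and 4-cluster configurations can be reproduced with a small combinatorial check. The sketch below rests on two assumptions that are not stated in this section: that the ten components of the distance threshold vector are ordered by pairs of cluster sizes as (1,1), (1,2), (1,3), (1,4), (2,2), (2,3), (2,4), (3,3), (3,4), (4,4), and that two clusters can be linked whenever their pair has a non-zero threshold. Only the connectivity of the link graph is tested, not whether a given arrangement is geometrically realizable; the names are hypothetical.

```python
from itertools import combinations_with_replacement

SIZES = (1, 2, 3, 4)  # cluster sizes below a size threshold of 5
PAIRS = list(combinations_with_replacement(SIZES, 2))  # assumed ordering of the vector


def size_range(delta, n):
    """(min, max) total size of n clusters that can be chained by allowed links."""
    allowed = {pair for pair, threshold in zip(PAIRS, delta) if threshold > 0}
    totals = []
    for sizes in combinations_with_replacement(SIZES, n):
        # Grow the set of clusters reachable from the first one through links.
        linked, frontier = {0}, [0]
        while frontier:
            i = frontier.pop()
            for j in range(n):
                pair = tuple(sorted((sizes[i], sizes[j])))
                if j not in linked and pair in allowed:
                    linked.add(j)
                    frontier.append(j)
        if len(linked) == n:
            totals.append(sum(sizes))
    return (min(totals), max(totals)) if totals else None


if __name__ == "__main__":
    delta = [0, 0, 0, 0, 0, 0, 2, 3, 4, 4]
    for n in (2, 3, 4):
        print(n, size_range(delta, n))
```

Under these assumptions the smallest 3-cluster is (2, 2, 4), with total size 8, and the ranges are 8 to 12 for three clusters and 10 to 16 for four clusters, matching the values given in the text.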



C. The Case $\alpha > 0$, and Finite-Size Effects



For $\alpha > 0$, the rate is expected to be $(1 - \alpha)\lambda_0$, where $\lambda_0$ is the value of the rate for the same parameters, except
that $\alpha = 0$. Figure 14 illustrates the reduction in $\lambda$ as $\alpha$ is varied. The results are as expected from Eq.
the errors from the simulations. 

The simulations presented in figure 14 were carried out with a very asymmetrical spectrogram: $T$ was chosen to
be 1/32 s, so $F = 32$ Hz, and the 992 Hz bandwidth was covered by only 31 pixels, while a segment had $32 \cdot 10^4$ bins
in time. Because of that, significant finite-size effects are expected; the rate predictions are based on the assumption
that the time-frequency plane is infinite, but in the present case it is small enough in the frequency dimension that
clusters are likely to be "clipped" and therefore to be smaller than expected. Consequently, it is expected that the
predictions will again overestimate the measured rate, and this is what is observed in figures 14 and 15. As can be
seen from figure 15, the fractional error from the prediction is still smaller than about 10% when the cluster density
is low enough.



VII. CONCLUSIONS 



TFCLUSTERS, a new time-frequency detector for bursts of gravitational radiation in broad-band interferometric
observatories, was described in some detail in this paper. The behavior of the detector when applied to white
Gaussian noise in the absence of signals was carefully analyzed, leading to a formalism for the computation of the
false alarm rate of TFCLUSTERS for any values of its many parameters. The results from numerical simulations showed
that this analysis was accurate in most situations: errors at the 1% level or better were obtained in "ideal" situations
(low cluster density, large number of frequency bins), and errors of the order of 10% appeared in situations where
the analysis was expected to be less accurate, due to high-order terms not included in the sums over the cluster
configurations, or due to finite-size effects from the limited bandwidth of the search. In the cases where the errors
were large, the analysis presented in this paper systematically overestimated the false alarm rates; should they
prove necessary in practical work, more accurate estimates could therefore be obtained from more careful
calculations, or from larger-scale numerical simulations. While the false alarm properties of TFCLUSTERS are well
understood, the efficiency of the detector is subject to more uncertainties.

A calculation presented in Appendix B showed that the efficiency of TFCLUSTERS was comparable to that
of the ideal power detector, while unfortunately illustrating at the same time the mathematical complexity associated with
producing such an estimate of the efficiency of the detector for a given signal. Nevertheless, the fundamental approach
used by TFCLUSTERS, namely the use of an adaptive power integral over pixels with excess power, was shown to
asymptotically maximize the estimate of the power in the signal, provided it does not overestimate it, over all
possible estimators of the same quantity. This naturally suggests optimal properties for TFCLUSTERS as a detector.
However, this can be nothing more than a conjecture at this point, as the actual proof of the optimality of TFCLUSTERS
in the (modified) Neyman-Pearson sense is of great difficulty.

Independently of the question of the optimality of TFCLUSTERS, the structure of the algorithm is particularly
practical for its implementation for the analysis of actual data. The first power threshold on individual pixels (step 
(ii)) can be chosen to be frequency dependent in order to allow the analysis of colored Gaussian noise when whitening 
of the data is not convenient; the errors on the black pixel probability that are introduced by the non-zero correlations 
coloring the noise are generally negligible. Moreover, frequency bands containing spurious interferences are easily left 
out of the analysis, with minor modifications to the algorithm presented in this paper. 

The search for gravitational waves will of course require the operation of TFCLUSTERS in coincidence at sites that 
are geographically separated and the use of information from auxiliary data that do not couple to the gravitational 
wave data, in order to avoid possible false detections due to non-Gaussian noise. A simple but complete analysis 
system using TFCLUSTERS and satisfying these requirements was developed in (3^] in order to compute upper limits on 
the rate of occurrence of events that are expected to radiate gravitational radiation. A comparison of event lists from 
the individual detectors was used to carry out this coincidence analysis. This approach may, however, not give the 
best network detection efficiency, in part because of the relatively coarse time resolution of TFCLUSTERS. Enhanced 
versions of the algorithm, designed to analyze coherently data from many detectors, might prove to be more efficient. 



APPENDIX A: PROOF OF THEOREM 1



Consider first the deterministic equivalent to Eq. (Q):

$$ y = s + \delta u, \tag{A1} $$

where $u$ is a nuisance parameter, so that the equivalent to Eq. (11) is

$$ P_{ij}(y) = |s_{ij} + \delta u_{ij}|^2, \tag{A2} $$

where $u_{ij} \in \mathbb{C}$ and where by definition $|u_{ij}| \le 1$.

Lemma 1:

For the model described by Eq. (A1) with $\delta^2 = \eta$, $\forall s \in \mathbb{R}^N$ and $\forall y$ satisfying Eq. (A1), $\hat{Q}(y) \le Q$.

Proof:

Consider Eq. (§§). Trivially, if $|\hat{s}_{ij}|^2 = 0$, then $|\hat{s}_{ij}|^2 \le |s_{ij}|^2$. For $|\hat{s}_{ij}|^2 > 0$, $|\hat{s}_{ij}|^2 = P_{ij}(y) - \eta$. From Eq. (A2),
$|s_{ij}|^2 \ge P_{ij}(y) - \delta^2$. Since $\delta^2 = \eta$, this gives $|s_{ij}|^2 \ge |\hat{s}_{ij}|^2$ for all $s$, $y$. Summing over $i$ and $j$ gives $\hat{Q} \le Q$.


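Lemma 2:

Given Eq. (A1) with $\delta^2 = \eta$, $\forall s \in \mathbb{R}^N$, and $\forall y$ respecting Eq. (A1), $\hat{Q}(y) \ge q(y)$, where $q(y)$ is any power estimator
satisfying

$$ q(y') \le Q, \quad \forall s' \in \Omega(s) \text{ and } \forall y' \text{ respecting Eq. (A1)}, \tag{A3} $$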



where 



n(«) = < s' e R N : \s - s'\ 2 < min 4<5 2 , 2S\S, 



1 + 4/1 



26 



(A4) 



Proof:

Suppose $\exists y^0$ such that $q(y^0) > \hat{Q}(y^0)$, with $y^0_{ij} = s^0_{ij} + \delta u^0_{ij}$. One can construct a signal $s'$ such that

$$ s'_{ij} = \begin{cases} y^0_{ij} - \delta u'_{ij} & \text{if } |y^0_{ij}| > \delta \\ 0 & \text{otherwise.} \end{cases} \tag{A5} $$

The freedom provided by $u'_{ij} \in \mathbb{C}$ is used to choose all values of $u'_{ij}$ so that $u'_{ij}$ and $s'_{ij}$ are orthogonal. If $s'_{ij} = 0$, the
direction of $u'_{ij}$ is unimportant. For all $i, j$, $|u'_{ij}| = 1$. Note that this choice of $u'_{ij}$ and $s'_{ij}$ satisfies Eq. (A1).



\ - 4 = 5 (4 - fi «). 14 - 4i 2 ^ 4(52 • Now ' 141 2 = 52 + i4f J 



1^-41-141+1^1-2(4'^) ( A 6) 
<2|.5°.| 2 + 2|.5°.|^ + 2|SO| 2 Ji + t^t. (A7) 



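Combining these inequalities and summing over $i, j$ shows that $s' \in \Omega(s)$.
Then,

$$ \hat{Q}(y^0) = \sum_{ij} \left( P_{ij}(y^0) - \eta \right)_+ \tag{A8} $$

$$ = \sum_{ij} \left( |s'_{ij} + \delta u'_{ij}|^2 - \eta \right)_+ \tag{A9} $$

$$ = \sum_{ij} \left( |s'_{ij}|^2 + \delta^2 - \eta \right) \tag{A10} $$

$$ = |s'|^2. \tag{A11} $$

It follows that $q(y^0) > \hat{Q}(y^0)$ implies $q(y^0) > |s'|^2$, contradicting Eq. (A3).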
Lemma 3:

For the model described by Eq. (A1) with $\delta^2 = \eta$, $\forall s \in \mathbb{R}^N$ and $\forall y$ satisfying Eq. (A1),
$\hat{Q}(y) \ge Q - \sum_{i,j} \min(2\delta^2, |s_{ij}|^2)$.

Proof:

Obviously, $|s_{ij}|^2 - |\hat{s}_{ij}|^2 \le |s_{ij}|^2$. Also, from Eq. (A2), $P_{ij}(y) \ge |s_{ij}|^2 - \delta^2$, so from Eq. (gg),
$|\hat{s}_{ij}|^2 = P_{ij}(y) - \delta^2 \ge |s_{ij}|^2 - 2\delta^2$ when $|\hat{s}_{ij}|^2 > 0$. When $|\hat{s}_{ij}|^2 = 0$, $\delta^2 \ge P_{ij}(y) \ge |s_{ij}|^2 - \delta^2$, so $|s_{ij}|^2 \le 2\delta^2$.
Combining the three inequalities gives $|s_{ij}|^2 - |\hat{s}_{ij}|^2 \le \min(2\delta^2, |s_{ij}|^2)$; summing over $i, j$ proves the lemma.



Proof of theorem 1:

In the noise model from Eq. (11), the power in the noise is distributed exponentially:

$$ p_{|n_{ij}|^2}(P) = \frac{e^{-P/\epsilon^2}}{\epsilon^2}. \tag{A12} $$

The corresponding cumulative probability distribution is

$$ P_{|n_{ij}|^2}(P) = 1 - e^{-P/\epsilon^2}. \tag{A13} $$

Hence, the probability that the maximum of $N$ independent realizations of that random variable is less than a threshold
$C_N$ is

$$ \pi_N = \Pr\left( \max_{ij} |n_{ij}|^2 < C_N \right) = \left( 1 - e^{-C_N/\epsilon^2} \right)^N. \tag{A14} $$

For $N \gg 1$, this can be rewritten as

$$ \pi_N = \exp\left( -e^{\log N - C_N/\epsilon^2} \right) \tag{A15} $$

for $C_N/\epsilon^2 \ge \log N$. Hence, for $C_N \to \epsilon^2 \log N$, this probability goes to $1/e$ as $N$ becomes large. Moreover, any value
of $\pi_N > 1/e$ can be achieved with the right choice of $C_N$; for $C_N = \beta\epsilon^2 \log N$, $\beta > 1$, $\pi_N \to 1$ for $N \gg 1$.
It should be noted that the event

$$ \max_{ij} |n_{ij}|^2 < \beta\epsilon^2 \log N \tag{A16} $$

implies a realization of the noise which can be mimicked by the deterministic noise model [Eq. (A1)] with
$\delta = \epsilon\sqrt{\beta \log N}$. Hence, the three statements of theorem 1 follow from the three lemmata proved in this appendix.
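
The limiting behaviour of the probability in Eq. (A14) can be checked numerically. The sketch below evaluates the exact expression for a threshold proportional to the logarithm of N and compares it with the large-N form; the function names and the sample values of N and of the proportionality factor are illustrative choices, not values used in the paper.

```python
import math


def prob_max_below(N, beta, eps2=1.0):
    """Exact (1 - exp(-C/eps2))**N for the threshold C = beta * eps2 * log(N)."""
    C = beta * eps2 * math.log(N)
    # log1p keeps precision when exp(-C / eps2) is tiny.
    return math.exp(N * math.log1p(-math.exp(-C / eps2)))


def large_n_form(N, beta):
    """Large-N approximation exp(-exp(log N - C/eps2)) = exp(-N**(1 - beta))."""
    return math.exp(-N ** (1.0 - beta))


if __name__ == "__main__":
    for beta in (1.0, 1.5, 2.0):
        N = 10 ** 6
        print(beta, prob_max_below(N, beta), large_n_form(N, beta))
```

For a proportionality factor of 1 both expressions approach 1/e, and for any larger factor they approach 1 as N grows, which is the property used in the proof.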



APPENDIX B: THE NARROW-BAND SIGNAL EXAMPLE



Given a $u$ by $v$ matrix $Q$ with elements representing the distribution of the power in the signal in a rectangular
sub-region of the time-frequency plane, the signal-to-noise ratio is defined by:

$$ \rho^2 = \frac{\sum_{i,j} Q_{ij}}{\epsilon^2}. \tag{B1} $$

The elements of the matrix $D$ representing the black pixel probability corresponding to the matrix $Q$ are given by
the integral of the density from Eq. (13):

$$ D_{ij} = \int_\eta^\infty p_{P_{ij}}(P|Q_{ij})\, dP. \tag{B2} $$



In general, not all pixels where some signal power is present will be black, and the signal will be detected only when
a number of black pixels equal to or larger than the threshold $\sigma$ form a connected cluster, or when a pair of smaller
clusters are close enough. The contribution to the probability of detection of the signal of such a configuration will be
the product of the black pixel probabilities ($D_{ij}$) and of the white pixel probabilities ($1 - D_{ij}$) for "holes". Although
noise fluctuations could in principle help the detection of a signal by forming a "bridge" over regions where no signal
is present, summing over these possibilities involves $2^{uv}$ terms. A slight underestimate of the probability of detection
is instead used by summing only over the $n$ pixels where signal is present ($n \le uv$), which reduces the number of
terms to be considered to

$$ \sum_{n_H=0}^{n-\sigma} \frac{n!}{(n - n_H)!\, n_H!}, \tag{B3} $$

where $n_H$ is the number of holes. Of course, the enumeration process can be greatly simplified when the distribution
of the power in the signal has some specific symmetries.
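
As a quick illustration of the saving, the sketch below evaluates the number of terms in Eq. (B3), i.e. the sum of binomial coefficients over the number of holes, and compares it with the full count of $2^{uv}$ black/white patterns. The function name and the sample sub-region size are hypothetical.

```python
from math import comb


def n_terms(n, sigma):
    """Number of black/white patterns on n signal pixels with at most n - sigma holes."""
    return sum(comb(n, n_holes) for n_holes in range(0, n - sigma + 1))


if __name__ == "__main__":
    n, sigma = 6, 3      # six signal pixels, cluster-size threshold of 3
    u, v = 3, 8          # hypothetical sub-region containing the signal
    print(n_terms(n, sigma), 2 ** (u * v))
```

For six signal pixels and a threshold of 3 this gives 1 + 6 + 15 + 20 = 42 terms, far fewer than the number of patterns over even a small rectangular sub-region.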



Consider now the example of section V B. Under the assumption that the starting time of the signal matches
exactly the binning of the spectrogram used to detect it, it will be represented by a row of 6 pixels, each with an equal
probability $p$ to be black, neglecting effects such as power leakage. The columns labeled "POD ($r \to p$, $s \to 0$)" in
tables III and IV give the contributions to the probability of detection (POD) of this signal from various thresholds.
The problem is slightly more complex when the arrival time is taken to be random. In that case, the signal is spread
in general over 7 pixels, with the central five having a black pixel probability $p$, and the leftmost and rightmost having
smaller probabilities $r$ and $s$, respectively. $r$ and $s$ are given by:

$$ r = \int_\eta^\infty p_{P_{ij}}(P|P'_s)\, dP \tag{B4} $$

and

$$ s = \int_\eta^\infty p_{P_{ij}}(P|P_s - P'_s)\, dP, \tag{B5} $$

where $P_s$ is the power in the central five pixels, $P'_s$ is the power in the leftmost pixel, and $p_{P_{ij}}(P|Q)$ is given by Eq.
(13). The POD is then given by

$$ \mathrm{POD} = \frac{1}{P_s} \int_0^{P_s} \mathrm{POD}\left( r(P'_s), s(P_s - P'_s) \right) dP'_s, \tag{B6} $$

where $\mathrm{POD}(r(P'_s), s(P_s - P'_s))$ is the probability of detection for the 7-pixel configuration; contributions to it from
various thresholds can be found in the columns labeled "POD" in tables III and IV. For the ideal power detector with
$\lambda = 1/(3600T)$, $\tau = 6T$ and $B/\phi = 512$, Eq. (41) gives $\kappa \approx 23.87\epsilon^2$ for $V = uv = 6$ and $\kappa \approx 25.55\epsilon^2$ for $V = uv = 7$.
Eq. (42) is then used directly to generate the ideal power detector curves of figure 5.
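
For the single-row signal considered in this appendix, the contribution of the cluster-size threshold alone can be checked by direct enumeration. The sketch below computes the probability that a row of independently blackened pixels contains a connected run of at least a given number of black pixels. It is a simplification made here for illustration: it ignores the pair-of-clusters criterion and any help from noise pixels outside the row, and the function names and sample probabilities are hypothetical.

```python
from itertools import product


def longest_run(bits):
    """Length of the longest run of consecutive black (1) pixels."""
    best = run = 0
    for b in bits:
        run = run + 1 if b else 0
        best = max(best, run)
    return best


def pod_row(probs, sigma):
    """Probability that a row with per-pixel black probabilities `probs`
    contains at least `sigma` connected black pixels."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        if longest_run(bits) >= sigma:
            weight = 1.0
            for b, p_black in zip(bits, probs):
                weight *= p_black if b else 1.0 - p_black
            total += weight
    return total


if __name__ == "__main__":
    p, r, s = 0.6, 0.3, 0.2                   # sample values only
    print(pod_row([p] * 6, 4))                # synchronized arrival, 6 pixels
    print(pod_row([r] + [p] * 5 + [s], 4))    # random arrival, 7 pixels
```

Enumerating all patterns is affordable here because the row has at most seven pixels; the symmetric cases reduce to short polynomials in the black pixel probability, for example $p^6 + 2p^5(1-p)$ for a threshold of 5 on six equal pixels.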



TABLE I: Coefficients $k_{S_1 S_2 R}(d)$, multiplied by $S_1 + S_2$, for clusters of size $S_1$ and $S_2$, with total perimeter $R$, and separated
by a distance $d$.

TABLE II: General formulae for $\langle \nu_{S_1,S_2}(d) \rangle$ for distances $d > 2$, for different cluster sizes $S_1$ and $S_2$, as deduced from table I.
By definition, $q = 1 - p$.

TABLE III: Contributions to the probability of detection (POD) for different thresholds on the cluster size, for a straight line
of 6 pixels, all having equal black pixel probability $p$ (columns labeled "POD ($r \to p$, $s \to 0$)"), and for a row of 7 pixels, with
5 central pixels having probability $p$, and the leftmost and rightmost pixels having probabilities $r$ and $s$, respectively (columns
labeled "POD"). By definition, $q = 1 - p$.

FIG. 1: Examples of the various steps of the TFCLUSTERS algorithm. From top to bottom, the time-frequency plane after Steps
(ii), (iii) and (iv) is shown, for the same segment of simulated data. These data are white Gaussian noise sampled at 16384
Hz with standard deviation $5 \cdot 10^{-23}\,\mathrm{Hz}^{-1/2} \sqrt{1000\,\mathrm{Hz}}$, with a mock signal from the coalescence of a binary consisting of two 10
$M_\odot$ black holes, as described in [17, Appendix A], injected at $t = 8$ s. The binary is placed at 20 Mpc, so the signal is strong,
with maximum strain amplitude $h = 1.7 \cdot 10^{-21}$. For this example, the TFCLUSTERS parameters are $T = 1/8$ s, $p = 0.1$, $\sigma = 5$,
$\delta = [0, 0, 0, 0, 0, 0, 2, 3, 4, 4]$, and $\alpha = 0.99$.

TABLE IV: Same as table III, for thresholds on the distance between two clusters of size $S_1$ and $S_2$.

FIG. 2: The maximal distance $r$ at which rotation-dominated sources at the sensitivity limit of the LIGO 4k interferometers
are expected to form clusters in the time-frequency plane.

FIG. 3: The mean number per pixel of clusters of size greater than or equal to the threshold $\sigma$ for different black pixel probabilities.
Solid line: $p = 0.1$, dashed: $p = 10^{-2}$, dash-dotted: $p = 10^{-3}$, dotted: $p = 10^{-4}$.

FIG. 4: The mean number per pixel of pairs of clusters of different sizes at distance smaller than or equal to the threshold $\delta$ for
different black pixel probabilities. Solid line: $p = 0.1$, dashed: $p = 10^{-2}$, dash-dotted: $p = 10^{-3}$, dotted: $p = 10^{-4}$.

FIG. 5: The optimized probability of detection of the $6T$ long signal with bandwidth $1/T$ as a function of its signal-to-noise ratio
$\rho$, for a false alarm rate $\lambda = 1/(3600T)$, and a search bandwidth $512/T$. The two rightmost curves correspond to the TFCLUSTERS
detector. For $\rho < 2.875$, the optimal thresholds are: $\sigma = 3$, $\delta = [0, 0, 3]$. For $\rho > 2.875$, they are: $\sigma = 4$, $\delta = [0, 0, 0, 0, 2, 0]$.
The two leftmost curves correspond to the ideal power detector. The solid lines are for the case where the arrival time matches
exactly the spectrogram gridding (i.e., the signal occupies six pixels), while the dash-dotted lines correspond to the POD for
the general case (power spread over seven pixels).

FIG. 6: The probability density $p(\Delta)$ of the time delay $\Delta$ between two successive clusters, as measured empirically for
$\eta = 3.719\epsilon^2$, $\sigma = 5$, $\delta = [0, 0, 0, 0, 0, 0, 2, 3, 4, 4]$, and $\alpha = 0$, for $T = 1/32$ s, $B = 992$ Hz. The two continuous lines correspond
to the extrema of the predicted Poisson distribution, assuming the value of the rate at both ends of its 90% confidence interval
($(7.28 \pm 0.02) \cdot 10^{-4}$ Hz).

FIG. 7: The measured rate $\lambda$ in white noise for various values of $\eta$, for $\delta = 0$, $\alpha = 0$, $T = 1$ s, $B = 1023$ Hz. From top to
bottom, the curves correspond to $\sigma = 3, 4, 5, 7$. Both the error bars on $\lambda$ and the predicted rates from Eq. (34) are hidden by
the thickness of the line.

FIG. 8: The fractional residuals $\Delta\lambda = (\lambda_{\mathrm{pred}} - \lambda)/\lambda_{\mathrm{pred}}$, where $\lambda_{\mathrm{pred}}$ are the predicted rates, corresponding to figure 7.
X-marks, plus signs, circles and squares correspond respectively to $\sigma = 3, 4, 5, 7$.

FIG. 9: The measured rate $\lambda(S)$ as a function of the cluster size $S$ for $\eta = 3\epsilon^2$, $\sigma = 5$, $\delta = 0$, $\alpha = 0$, $T = 1$ s, $B = 1023$ Hz.
The plus signs are results from simulations, and the continuous line corresponds to the predictions. This plot was constructed
from $7 \cdot 10^6$ s worth of simulated data.




FIG. 10: The measured rate $\lambda$ in white noise for various values of $\eta$, for $\sigma = 5$, $\delta = [0, 0, 0, 0, 0, 0, 2, 3, 4, 4]$, $\alpha = 0$, $T = 1$ s,
$B = 1023$ Hz. The dotted line corresponds to the predicted rate.

FIG. 12: The measured rate $\lambda(S)$ as a function of the cluster size $S$ for $\eta = 3\epsilon^2$, $\sigma = 5$, $\delta = [0, 0, 0, 0, 0, 0, 2, 3, 4, 4]$, $\alpha = 0$,
$T = 1$ s, $B = 1023$ Hz. The plus signs are results from simulations, and the continuous line corresponds to the predictions.
This plot was constructed from $5 \cdot 10^6$ s worth of simulated data.

FIG. 14: The measured rate $\lambda$ in white noise for various values of $\eta$, for $\sigma = 5$, $\delta = [0, 0, 0, 0, 0, 0, 2, 3, 4, 4]$, $T = 1/32$ s,
$B = 992$ Hz. From top to bottom, the curves correspond to $\alpha = 0, 3/4, 15/16$. The dotted line corresponds to the predicted
rate.

FIG. 15: The fractional residuals defined as in figure 8, but corresponding to figure 14, for the curve with



